-- The MAILING DATE of this communication appears on the cover sheet with the correspondence address --
Period for Reply 

A SHORTENED STATUTORY PERIOD FOR REPLY IS SET TO EXPIRE 3 MONTH(S) FROM
THE MAILING DATE OF THIS COMMUNICATION. 

- Extensions of time may be available under the provisions of 37 CFR 1.136(a). In no event, however, may a reply be timely filed 
after SIX (6) MONTHS from the mailing date of this communication. 

- If the period for reply specified above is less than thirty (30) days, a reply within the statutory minimum of thirty (30) days will be considered timely.

- If NO period for reply is specified above, the maximum statutory period will apply and will expire SIX (6) MONTHS from the mailing date of this communication. 

- Failure to reply within the set or extended period for reply will, by statute, cause the application to become ABANDONED (35 U.S.C. § 133).
Any reply received by the Office later than three months after the mailing date of this communication, even if timely filed, may reduce any
earned patent term adjustment. See 37 CFR 1.704(b).

Status 

1) ☒ Responsive to communication(s) filed on 13 September 2004.
2a)\3 This action is FINAL. 2b)K This action is non-final. 

3) □ Since this application is in condition for allowance except for formal matters, prosecution as to the merits is
closed in accordance with the practice under Ex parte Quayle, 1935 C.D. 11, 453 O.G. 213.

Disposition of Claims 

4) ☒ Claim(s) 1-25 is/are pending in the application.

4a) Of the above claim(s) is/are withdrawn from consideration. 

5) □ Claim(s) is/are allowed.

6) ☒ Claim(s) 1-25 is/are rejected.
7) □ Claim(s) is/are objected to.

8) □ Claim(s) are subject to restriction and/or election requirement.

Application Papers 

9) □ The specification is objected to by the Examiner.

10) ☒ The drawing(s) filed on 13 September 2004 is/are: a)S accepted or b)^ objected to by the Examiner.
Applicant may not request that any objection to the drawing(s) be held in abeyance. See 37 CFR 1.85(a).
Replacement drawing sheet(s) including the correction is required if the drawing(s) is objected to. See 37 CFR 1.121(d).

11) □ The oath or declaration is objected to by the Examiner. Note the attached Office Action or form PTO-152.

Priority under 35 U.S.C. § 119 

12) □ Acknowledgment is made of a claim for foreign priority under 35 U.S.C. § 119(a)-(d) or (f).
a)n All b)n Some * c)^ None of: 

1. □ Certified copies of the priority documents have been received.
2. □ Certified copies of the priority documents have been received in Application No.
3. □ Copies of the certified copies of the priority documents have been received in this National Stage
application from the International Bureau (PCT Rule 17.2(a)). 
* See the attached detailed Office action for a list of the certified copies not received. 
DETAILED ACTION

This action is responsive to communications: amendment, filed 13 September 2004, to
the original application filed 4 May 2001.

Claims 1-25 are pending. Claims 1, 5, 9, 13, and 15 are independent claims. Claims 
18-25 are newly added claims. 

Response to Arguments 

Applicant's arguments, see amendment, filed 13 September 2004, with respect
to 35 U.S.C. 101 have been fully considered and are persuasive. The rejections of
claims 1, 5, 9, and 15 under 35 U.S.C. 101 have been withdrawn.

Applicant's arguments, see amendment, filed 13 September 2004, with respect 
to the rejection(s) of claim(s) 1-17 under 35 U.S.C. 103(a) have been fully considered
and are persuasive. Therefore, the rejection has been withdrawn. However, upon 
further consideration, a new ground(s) of rejection is made in view of newly found prior 
art. 



Claim Rejections - 35 USC § 103 
The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

Claims 1, 3, 5, 7, 9, 11, 13, 15, and 17-25 are rejected under 35 U.S.C. 103(a) as
being unpatentable over Pirolli (U.S. Patent 5,895,470) in view of Call (U.S. Publication 
2002/0165707 A1). 

As per claims 5, 9, and 13, Pirolli discloses an apparatus, program instructions 
and method of converting, organizing, and representing in a computer memory a 
document corpus containing an ordered number of documents (See Pirolli, Column 7, 
lines 35-39). Pirolli does not disclose expressly developing a first uninterrupted listing of 
integers to correspond to an occurrence of terms in the document corpus. Call discloses 
developing an uninterrupted array of integers corresponding to an occurrence of terms 
(See Call, Figure 1, element 135, and Page 3, paragraph 0029). Pirolli and Call are 
analogous art because they are from the same field of endeavor of processing 
electronic text data. At the time of the invention it would have been obvious to a person 
of ordinary skill in the art to include the array of integers corresponding to an occurrence 
of terms of Call with the method of Pirolli. The motivation for doing so would have been 
to permit more efficient execution of processing functions of the type typically performed 
by data processors (See Call, Page 1, paragraph 0010). Therefore, it would have been 
obvious to combine Call with Pirolli for the benefit of permitting more efficient execution 
of processing functions of the type typically performed by data processors to obtain the
invention as specified in claims 5, 9, and 13. 

As per claim 15, Pirolli discloses a data converter for organizing and representing
in a computer memory a document corpus containing an ordered number of documents,
for use by a data mining applications program requiring occurrence-of-terms data (See
Pirolli, Column 13, lines 18-46), the representation to be based on terms in a dictionary
previously developed for the document corpus and where each term in the dictionary
has associated therewith a corresponding unique integer (See Pirolli, Pages 6-7,
paragraphs 0076-0083). Pirolli also discloses means for developing an uninterrupted
listing of the unique integers to correspond to the occurrence of the dictionary terms in
the document corpus (See Pirolli, Column 7, lines 33-62). Pirolli does not disclose
expressly developing an uninterrupted listing of integers to correspond to an occurrence
of dictionary terms in the document corpus. Call discloses developing an uninterrupted 
array of integers corresponding to an occurrence of terms (See Call, Figure 1, element 
135, and Page 3, paragraph 0029). Pirolli and Call are analogous art because they are 
from the same field of endeavor of processing electronic text data. At the time of the 
invention it would have been obvious to a person of ordinary skill in the art to include the 
array of integers corresponding to an occurrence of terms of Call with the method of 
Pirolli. The motivation for doing so would have been to permit more efficient execution 
of processing functions of the type typically performed by data processors (See Call, 
Page 1, paragraph 0010). Therefore, it would have been obvious to combine Call with
Pirolli for the benefit of permitting more efficient execution of processing functions of the 
type typically performed by data processors to obtain the invention as specified in claim
15. 

As per claim 17, Pirolli and Call disclose the limitations of claim 15 as described
above. Pirolli also discloses developing an uninterrupted listing for the entire document 
corpus, the uninterrupted listing containing, in sequence, the location of each 
corresponding document in the first uninterrupted listing (See Pirolli, Page 5, paragraph 
0051). 

As per claim 1, Pirolli discloses a method of converting a document corpus
containing an ordered number of documents into a compact representation in memory 
of occurrence data (See Pirolli, Column 7, lines 35-39). Pirolli does not disclose 
expressly developing a first vector for the entire document corpus, the first vector being 
a listing of integers corresponding to terms in the documents such that each document
in the document corpus is sequentially represented in the listing. Call discloses 
developing an uninterrupted array of integers corresponding to an occurrence of terms 
(See Call, Figure 1, element 135, and Page 3, paragraph 0029). Pirolli and Call are 
analogous art because they are from the same field of endeavor of processing 
electronic text data. At the time of the invention it would have been obvious to a person 
of ordinary skill in the art to include the array of integers corresponding to an occurrence 
of terms of Call with the method of Pirolli. The motivation for doing so would have been 
to permit more efficient execution of processing functions of the type typically performed 
by data processors (See Call, Page 1, paragraph 0010). Therefore, it would have been 
obvious to combine Call with Pirolli for the benefit of permitting more efficient execution 
of processing functions of the type typically performed by data processors to obtain the
invention as specified in claim 1.

As per claims 3, 7, and 11, Pirolli and Call disclose the limitations of claims 1, 5,
and 9 as described above. Call also discloses rearranging, or sorting, in the first vector,
the order of the unique integers within the data for each document so that the terms are
in alphabetical order which would cause all identical unique integers to be adjacent (See
Call, Page 5, paragraph 0051). Pirolli and Call are analogous art because they are from
the same field of endeavor of processing electronic text data. At the time of the
invention it would have been obvious to a person of ordinary skill in the art to include the 
sorting of terms of Call with the method of Pirolli and Call. The motivation for doing so 
would have been to allow the terms to be displayed in sorted order (See Call, Page 5,
paragraph 0051). Therefore, it would have been obvious to combine Call with Pirolli
and Call for the benefit of allowing the terms to be displayed in sorted order to obtain 
the invention as specified in claims 3, 7, and 11. 
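
For illustration only, and not drawn from Pirolli, Call, or the claims, the rearranging step discussed above can be sketched as follows. The function name `sort_within_documents` and the `doc_starts` offsets are hypothetical, and the sketch sorts the integers numerically, which makes identical integers adjacent. Alphabetical term order would follow only under the further assumption that integers were assigned to terms in alphabetical order.

```python
from typing import List


def sort_within_documents(listing: List[int], doc_starts: List[int]) -> List[int]:
    """Sort the term integers inside each document's span of a flat listing.

    `listing` is a flat list of term integers for the whole corpus and
    `doc_starts` holds the index where each document begins (both are
    illustrative assumptions, not structures quoted from the references).
    """
    result: List[int] = []
    for i, start in enumerate(doc_starts):
        # A document's span ends where the next document starts, or at the end.
        end = doc_starts[i + 1] if i + 1 < len(doc_starts) else len(listing)
        # Sorting only this span keeps documents in their original sequence.
        result.extend(sorted(listing[start:end]))
    return result


# Two documents: positions 0-3 and 4-6.
sorted_listing = sort_within_documents([2, 0, 2, 1, 3, 0, 3], [0, 4])
```

Because each span is sorted independently, document boundaries are unchanged and only the order within a document differs.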

As per claims 18, 20, 22, and 24, Pirolli and Call disclose the limitations of claims
1, 5, 9 and 13 as described above. Call also discloses developing a dictionary, or term
table, including terms contained in the document corpus and associating with each 
dictionary term, an integer to be uniquely corresponding to the dictionary term, the 
uniquely corresponding integers used in the first uninterrupted listing (See Call, Pages 
6-7, paragraphs 0076-0083). Pirolli and Call are analogous art because they are from
the same field of endeavor of processing electronic text data. At the time of the 
invention it would have been obvious to a person of ordinary skill in the art to include the 
term table of Call with the method of Pirolli and Call. The motivation for doing so would
have been to allow a user to search the text for a term matching a particular term (See 
Call, Page 7, paragraph 0082). Therefore, it would have been obvious to combine Call 
with Pirolli and Call for the benefit of allowing a user to search the text for a term 
matching a particular term to obtain the invention as specified in claims 18, 20, 22, and
24. 

As per claims 19, 21, 23 and 25, Pirolli and Call disclose the limitations of claims
1, 5, 9 and 13 as described above. Pirolli also discloses developing a second
uninterrupted listing for the entire document corpus, the second uninterrupted listing 
containing, in sequence, the location of each corresponding document in the first 
uninterrupted listing (See Pirolli, Column 7, lines 33-62). 
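
For illustration only, the arrangement of a term dictionary, a first uninterrupted listing, and a second uninterrupted listing described in the claim language above can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Pirolli or Call: the function name `build_corpus_listings`, whitespace tokenization, lowercasing, and assignment of integers in order of first appearance are all hypothetical choices.

```python
from typing import Dict, List, Tuple


def build_corpus_listings(
    documents: List[str],
) -> Tuple[Dict[str, int], List[int], List[int]]:
    """Return (dictionary, first_listing, second_listing) for a corpus.

    dictionary     : maps each term to a unique integer
    first_listing  : one flat list of term integers, documents in sequence
    second_listing : start index of each document within first_listing
    """
    dictionary: Dict[str, int] = {}
    first_listing: List[int] = []
    second_listing: List[int] = []
    for text in documents:
        # Record where this document begins in the flat listing.
        second_listing.append(len(first_listing))
        for term in text.lower().split():  # assumed tokenization
            # A new term receives the next unused integer.
            term_id = dictionary.setdefault(term, len(dictionary))
            first_listing.append(term_id)
    return dictionary, first_listing, second_listing


docs = ["the cat sat", "the dog sat down"]
dictionary, first_listing, second_listing = build_corpus_listings(docs)
# first_listing  -> [0, 1, 2, 0, 3, 2, 4]
# second_listing -> [0, 3]
```

Under these assumptions, document i occupies `first_listing[second_listing[i]:second_listing[i + 1]]`, with the last document running to the end of the listing.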

Claims 2, 6, 10, 14, and 16 are rejected under 35 U.S.C. 103(a) as being 
unpatentable over Pirolli (U.S. Patent 5,895,470) in view of Call (U.S. Publication 
2002/0165707 A1) as applied to claims 15, 19, 21, 23, and 25 above, and further in
view of Cohen (U.S. Patent 5,950,189).

As per claims 2, 6, 10, 14, and 16, Pirolli and Call disclose the limitations of 
claims 15, 19, 21, 23, and 25 as described above. Pirolli and Call do not disclose
expressly developing a third uninterrupted listing for the entire document corpus, the 
third uninterrupted listing containing a sequential listing of floating point multipliers, each 
floating point multiplier representing a document normalization factor for a 
corresponding document in the document corpus. Cohen discloses developing a 
normalized vector containing floating point multipliers (See Cohen, Column 11, lines 1-39).
Pirolli, Call and Cohen are analogous art because they are from the same field of
endeavor of processing electronic text data. At the time of the invention it would have 
been obvious to a person of ordinary skill in the art to include the normalized vectors of 
Cohen with the method of Pirolli and Call. The motivation for doing so would have been 
to accurately identify the high matches of document terms and their values (See Cohen, 
Column 9, lines 28-30). Therefore, it would have been obvious to combine Cohen with 
Pirolli and Call for the benefit of accurately identifying the high matches of document 
terms and their values to obtain the invention as specified in claims 2, 6, 10, 14 and 16. 

Claims 4, 8, and 12 are rejected under 35 U.S.C. 103(a) as being unpatentable 
over Pirolli (U.S. Patent 5,895,470) in view of Call (U.S. Publication 2002/0165707 A1)
and Cohen (U.S. Patent 5,950,189) as applied to claims 2, 6, and 10 above, and further
in view of Jagadish (U.S. Patent 6,401,088 B1).

As per claims 4, 8, and 12, Pirolli, Call and Cohen disclose the limitations of 
claims 2, 6, and 10 as described above. Pirolli, Call and Cohen do not disclose 
expressly that the normalization factor is the number of occurrences of a specific term in 
the document that represents the reciprocal of the square root of the sum of squares of 
all term occurrences in the document. Jagadish discloses calculating a normalization 
factor using an algorithm that can be refined to determine the number of term 
occurrences in a document (See Jagadish, Figure 6, and Column 8, lines 14-46). 
Pirolli, Call, Cohen and Jagadish are analogous art because they are from the same 
field of endeavor of processing electronic text data. At the time of the invention it would
have been obvious to a person of ordinary skill in the art to include the normalization 
factor of Jagadish with the method of Pirolli, Call and Cohen. The motivation for doing 
so would have been to obtain a quick estimate of the number of times a particular 
substring, or term, occurs (See Jagadish, Column 1, lines 23-24). Therefore, it would 
have been obvious to combine Jagadish with Pirolli, Call and Cohen for the benefit of 
obtaining a quick estimate of the number of times a particular substring, or term, occurs 
to obtain the invention as specified in claims 4, 8 and 12. 
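
For illustration only, one reading of the normalization factor recited above (the reciprocal of the square root of the sum of squares of the term occurrence counts in a document) can be sketched as follows. This is an assumption-laden sketch, not the method of Cohen or Jagadish: the function name `normalization_factors`, the flat `listing` of term integers, and the `doc_starts` offsets are hypothetical, and an empty document is arbitrarily given a factor of 0.0.

```python
import math
from collections import Counter
from typing import List


def normalization_factors(listing: List[int], doc_starts: List[int]) -> List[float]:
    """Return one floating point multiplier per document.

    For each document, count how often each term integer occurs, then
    compute 1 / sqrt(sum of squared counts).
    """
    factors: List[float] = []
    for i, start in enumerate(doc_starts):
        # A document's span ends where the next document starts, or at the end.
        end = doc_starts[i + 1] if i + 1 < len(doc_starts) else len(listing)
        counts = Counter(listing[start:end])
        sum_of_squares = sum(c * c for c in counts.values())
        # Assumed convention: an empty document gets 0.0 to avoid dividing by zero.
        factors.append(1.0 / math.sqrt(sum_of_squares) if sum_of_squares else 0.0)
    return factors


# Document 0 holds terms [0, 0, 1]; document 1 holds [2, 2, 2, 2].
factors = normalization_factors([0, 0, 1, 2, 2, 2, 2], [0, 3])
# Document 0: counts 2 and 1 -> 1 / sqrt(5); document 1: count 4 -> 1 / 4.
```

The resulting list plays the role of a sequential listing of floating point multipliers, one entry per document in corpus order.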

Conclusion 

The prior art made of record and not relied upon is considered pertinent to 
applicant's disclosure. 

• Tsai (U.S. Patent 5,818,877) discloses a method for reducing storage 
requirements for grouped data values. 

• Wacholder (U.S. Patent 6,167,368) discloses a method and system for 
identifying significant topics of a document. 

• Appelt (U.S. Patent 6,601,026 B2) discloses information retrieval by
natural language querying. 

• Mizutani (U.S. Patent 5,749,953) discloses a document search method. 

• Hazelhurst (U.S. Patent 5,974,412) discloses an intelligent query system 
for automatically indexing information in a database and automatically 
categorizing users. 

• Buckley discloses the optimization of inverted vector searches.

• Lee discloses combining multiple evidence from different properties of
weighting schemes.

• Viles discloses term weights in dynamic information retrieval.

• Fagan discloses automatic phrase indexing for document retrieval.



Any inquiry concerning this communication or earlier communications from the examiner 
should be directed to Laurie Ries whose telephone number is (571) 272-4095. If
attempts to reach the examiner by telephone are unsuccessful, the examiner's
supervisor, Joseph Feild, can be reached at (571) 272-4090.

Information regarding the status of an application may be obtained from the Patent 
Application Information Retrieval (PAIR) system. Status information for published 
applications may be obtained from either Private PAIR or Public PAIR. Status
information for unpublished applications is available through Private PAIR only. For more 
information about the PAIR system, see http://pair-direct.uspto.gov. Should you have 
questions on access to the Private PAIR system, contact the Electronic Business Center 
(EBC) at 866-217-9197 (toll-free). 